Haplogroup G2a3b1 (Y-DNA)

Haplogroup G2a3b1
Possible time of origin perhaps 5,000 years BP
Possible place of origin perhaps Iran or Caucasus Mtns. or Middle East
Ancestor Haplogroup G2a3b (L141)
Descendants G2a3b1, G2a3b1a, G2a3b1b, G2a3b1c
Defining mutations P303 or S135 (G2a3b1), L140 (G2a3b1a), U1 (G2a3b1a1), L13/S13 (G2a3b1a1a), L497 (G2a3b1a2), L43/S147 (G2a3b1a2a), L42/S146 (G2a3b1a2a1)

In human genetics, Haplogroup G2a3b1 (P303) is a Y-chromosome haplogroup. It is a branch of haplogroup G (Y-DNA) (M201). In descending order, G2a3b1 is additionally a branch of G2 (P287), G2a (P15), G2a3 (L30 or S126) and finally G2a3b (L141). This haplogroup represents the majority of haplogroup G men in most areas of Europe west of Russia and the Black Sea. To the east, G2a3b1 (except in the Caucasus Mountains area) is only a minority, large or small, among G persons in such locales as Turkey, the Middle East, Iran, the southern Caucasus area, China and India.
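
The chain of branches described above can be written down as a small lookup structure. The following Python sketch is illustrative only: the clade names and SNPs are taken from the text, while the function names are hypothetical and alternate SNP names (such as S126 or S135) are left out for brevity.

```python
# Upstream chain as stated in the text: (clade, defining SNP), oldest first.
CHAIN = [
    ("G", "M201"),
    ("G2", "P287"),
    ("G2a", "P15"),
    ("G2a3", "L30"),
    ("G2a3b", "L141"),
    ("G2a3b1", "P303"),
]


def ancestors(clade):
    """Return the clades above `clade`, oldest first."""
    names = [name for name, _ in CHAIN]
    return names[: names.index(clade)]


def expected_positive_snps(clade):
    """SNPs a man in `clade` is expected to carry, oldest first."""
    names = [name for name, _ in CHAIN]
    return [snp for _, snp in CHAIN[: names.index(clade) + 1]]


print(ancestors("G2a3b1"))
print(expected_positive_snps("G2a3b1"))
```

A flat list is enough here because the text describes a single unbranched line from G down to G2a3b1.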

Genetic Features

All G2a3b1 men carry the P303 or S135 SNP Y-DNA mutation. There are also some short tandem repeat (STR) findings among G2a3b1 men which help in subgrouping them. Many of the men have an unusual value of 13 for marker DYS388, and some have 9 at DYS568. STR marker oddities are often different in each G2a3b1 subgroup, and characteristic marker values can vary by subgroup. Often the values of STR markers DYS391, DYS392 and DYS393, however, are respectively 10, 11 and 14 or some slight variation on these for all G2a3b1 men.

P303 first became available for public testing in fall 2008, and the P designation indicates it was identified at the University of Arizona. The mutation is found on the Y chromosome at position 20104736. The forward primer is TTCTTATTTGCTTTGAAACTCAG. The reverse primer is ATTGGCTTATCAGATTGACG. The mutation involves replacement of T by C.[1] This mutation was actually first identified as S135 at Ethnoancestry in London, England, but it took some time to realize that P303, which was independently identified, and S135 were the same.
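
As a small illustration of how the primer pair quoted above might be handled in software, the Python sketch below reports the length and GC fraction of each primer and the reverse complement of the reverse primer, which is how that primer reads on the forward strand. Only the two sequences come from the text; the helper names are hypothetical and not part of any laboratory's tooling.

```python
# Primer sequences as quoted in the text for P303; the rest is illustrative.
FORWARD = "TTCTTATTTGCTTTGAAACTCAG"
REVERSE = "ATTGGCTTATCAGATTGACG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases in `seq` that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(name, len(primer), round(gc_fraction(primer), 2))

print(reverse_complement(REVERSE))
```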

Dating of G2a3b1 Origin

Research studies have not yet dated the origin of G2a3b1. Based on the number of mutations seen in 67-marker STR values, this P303 mutation perhaps occurred about 5,000 years ago, and the major subgroups developed their mutations several millennia later. The spread to Europe in some subgroups seems to have occurred primarily in the period of 2,500 to 1,500 years ago based on comparisons of samples from Europe west of the Black Sea to samples from more easterly locales. Several subgroups may have originated within Europe.

General Old World Geographical Distribution

The distribution information here is based on the large collection of very likely or proven G2a3b1 samples in the Haplogroup G Project derived from multiple sources.[2]

G2a3b1 definable subgroups are heavily concentrated in Europe west of Russia and the Black Sea, but small numbers are also found in a geographical belt extending from northern Africa to northeastern Asia.

In Europe, the Baltic countries have the lowest population percentage of G2a3b1. Scandinavia is similar, showing half the percentages of G persons seen in the countries to the south.

G2a3b1 seems to represent the same percentage of the population in both central and southern Europe and usually represents half or more of the G seen in the population in these areas.

Isolated G2a3b1 samples are found also in North Africa (Tunisia, Libya and Egypt), in the Middle East (Saudi Arabia and Dubai), the Caucasus Mountains area (Armenia, Georgia, Azerbaijan, N. Ossetia in Russia, among Russian Kabardinians, Abazinia in Georgia), in Iran, in Uzbekistan and among the ethnic groups of northwestern China and Russian Siberia. A distinctive Indian type of G2a3b1 exists, but its prevalence is unclear. An isolated G2a3b1 sample from Malaysia also exists.

Relation to High Mountain Areas?

Undocumented descriptions of the distribution of the more general G2a category in Europe (which actually would be dominated by G2a3b1) have spoken of unusual concentrations of G2a in mountainous areas of Europe, such as Switzerland. But based on STR marker samples, Switzerland has about the same percentage of haplogroup G as some lower-altitude European countries.[3] Sardinia has a significant amount of a unique type of haplogroup G based on STR marker values,[4] and the percentage of G in the highlands is not unusually higher than, for example, in the northern coastal area.[5] There is no noticeable concentration of haplogroup G (and G2a3b1) in the high Alpine areas of Italy, which has a varied landscape.[3] In addition, the high mountainous region east of the Black Sea where concentrated pockets of haplogroup G are found contains only tiny amounts of the types of haplogroup G found in Europe proper. Austria is the one European exception where haplogroup G seems more common in the highest western mountains than elsewhere, but much of Austria is mountainous.[3]

Concentrations of G2a3b1 at Certain Sites

The highest percentage of G2a3b1 persons in a discrete population so far described is in the island of Ibiza off the eastern Spanish coast. All of the available haplogroup G samples from there are typical G2a3b1 samples based on STR marker values. In total, about 16% of the population is likely G2a3b1 on the same basis.[6][7]

Ibiza samples include identifiable persons from the DYS388=13 subgroup. Because value combinations of the STR marker samples in Ibiza are also common in Sephardic Jewish samples,[6] the haplogroup G in Ibiza might be related to the significant population of Crypto-Jews in Ibiza.[8]

Haplogroup G among available samples from Wales is overwhelmingly G2a3b1. Such a high percentage is not found in nearby England, Scotland or Ireland.

G2a3b1*

The asterisk indicates a negative result for the defined subgroups of G2a3b1. This category was established in April 2010, when it was determined that persons with the L140 SNP mutation comprise a separate subgroup of G2a3b1. So far, only a few samples from India and one from Iran have been identified as belonging to this category.

G2a3b1a (L140+)

Persons in this category have the L140 SNP mutation. L140 was identified at Family Tree DNA in 2009, but the determination that not all G2a3b1 persons have this mutation was not made until April 2010. This mutation is located at chromosome position 7630859 and is a deletion.[9]

The U1+ G2a3b1a1 Subgroups

G2a3b1a1* (U1+, L13-/S13-)

A high percentage of all tested European U1+ persons so far are positive for the subgroup in which the L13 or S13 SNP mutation is present. Preliminary indications are that U1+/L13- persons may be common in the northern Caucasus region and some samples have been found in Armenia as well.

U1 was first identified at the University of Central Florida in 2006, but it was not described in a publication until 2009. The listed technical specifications are: location rs9785956; forward primer TTTCTGCTCCAAATCTGCTG; reverse primer CACCTGTAATCGGGAGGCTA; the mutation involves a change from A to G.[10]

G2a3b1a1a (U1+, L13+/S13+)

This second most populous G2a3b1 subgroup is characterized by the presence of the L13/S13 SNP. Almost all L13+ persons of European ancestry have the value of 12 or 13 at STR marker DYS385a and values of 19,20 at STR marker YCA. There are a few L13+ samples available which lack these mutations, and a shared common ancestor farther back in time from the others can be presumed for these samples.

The L13/S13 SNP was first identified at the University of Central Florida in 2006 as the U13 SNP, but prior to the publication of the details of this research in 2009,[10] the SNP was also independently identified in 2008 at Family Tree DNA in Houston, Texas, as L13 and at Ethnoancestry in England as S13 and made available for public testing. The technical specifications are given as: Y chromosome location rs9786706; forward primer GTGGTAACAGCTCCTGGTGAG; reverse primer TGCTGCTTTGGTTAACTGTCC; the mutation involves a change from C to T.[10]

The G2a3b1a1a subgroup is most common in north central Europe and is found in almost all places in Europe where other types of G are seen, but this subgroup seems rare in almost all countries outside Europe. Some atypical marker values seen outside Europe make L13+ persons hard to identify based on marker values, and the L13+ status may be more common in such places as the Caucasus Mountain region than presently suggested. The Haplogroup G Project has also collected samples from the Middle East, Iran and Caucasus Mountain region with mutated marker values similar to the European G2a3b1a1 men, especially at STR markers DYS385 and YCA. But the similarities could be only coincidental due to lack of SNP testing.

The common ancestor of almost all European L13+ men seems to have lived at least 2,500 years ago based on comparative 67-marker STR samples, and the mutation itself may be about 3,000 years old based on the same data.

The Haplogroup G Project has indicated among its large G collection that likely or proven G2a3b1a1 STR samples comprise the following percentages of available G samples in the following European countries [in descending order]:

Germany, 16%; Italy, 11%; Netherlands, 10%; France, 10%; Poland, 9%; Spain, 9%; Ireland, 6%; England, 5%; Switzerland, 4%.[11]

The L497+ G2a3b1a2 Subgroups

The largest G2a3b1a subgroup in Europe based on available samples is G2a3b1a2, in which men have the L497 mutation. This SNP was first identified in January 2011 in testing at 23andMe and made available for separate testing as L497 by Family Tree DNA. The chromosome locations are given as 15932714 and rs35141399, and the mutation is from C to T. The forward primer is ATGAGTGGCCTCACCAAGGGAATC and the reverse primer is ATGGGCAACAGGTGTCCTGAAG.

A high percentage of men with L497 have the value of 13 at STR marker DYS388. This is a rare mutation from the ancestral value of 12. A very small number of men within this DYS388=13 subgroup seem to have mutated yet again to 12 or 14. The geographical distribution of this 13 mutation and other features were first described in a research journal in 2007.[12] Percentages of DYS388=13 men within G samples are particularly high in northwestern Europe. Some DYS388=13 subgroups below are based on SNP mutations and others on STR marker value oddities.

The Haplogroup G Project has indicated among its large G collection that likely or proven STR samples from the DYS388=13 type of G2a3b1a comprise the following percentages of available G samples in the following countries [in descending order]:

Switzerland, 74%; Spain, 60%; France, 58%; Germany, 57%; England, 54%; Ireland, 48%; Netherlands, 45%; Italy, 43%; Poland, 29%; India, 0%.[11]

The Polish percentage of DYS388=13 men is diminished solely because of the origins of a significant group of G2c men in that country. Without the G2c group, the DYS388=13 percentage is 50%. The German G samples are much more numerous in the southwestern part of the country.
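
The two Polish figures above imply a rough size for the G2c group. As a back-of-the-envelope check, and assuming that both percentages were computed on the same set of Polish G samples and that removing G2c is the only difference between them, the implied G2c share is 1 - 0.29/0.50. The function name below is hypothetical, and the result is an inference from the quoted percentages, not a figure reported by the Haplogroup G Project.

```python
def implied_excluded_share(share_with_group, share_without_group):
    """Share of all samples taken up by the excluded group.

    If x is that share, then share_with_group / (1 - x) = share_without_group,
    so x = 1 - share_with_group / share_without_group.
    """
    return 1 - share_with_group / share_without_group


# 29% of all Polish G samples are DYS388=13; 50% once G2c is set aside.
g2c_share = implied_excluded_share(0.29, 0.50)
print(f"Implied G2c share of Polish G samples: {g2c_share:.0%}")
```

Under those assumptions the G2c group would account for roughly two fifths of the Polish G samples, which is consistent with the text calling it a significant group.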

While DYS388=13 G persons are found in Iran, so far all such samples have been found not at all related to the DYS388=13 persons of Europe. Apparently the mutation to 13 occurred independently in other G categories in that country. The only non-European DYS388=13 sample that has surfaced from the Old World that has similar STR marker values to the Europeans is a single sample from Egypt.[13]

The age of the mutation of DYS388 to 13 seems at least 2,500 yrs old based on comparison of 67-marker STR samples. The paucity of proven samples from outside Europe so far leaves open the possibility this DYS388 mutation originated in a European.

G2a3b1a2a (L43+/S147+, L42-/S146-)

This G2a3b1a2a subgroup is rare because virtually all tested L43+/S147+ persons so far are also L42+/S146+.

The SNP that characterizes G2a3b1a2a was first identified in a listing of SNP results from testing at 23andMe. It was independently developed as a separate test by both Family Tree DNA as L43 and by Ethnoancestry as S147. In fall 2009 a test again at 23andMe provided information for the first time that a person who had the L43 mutation simultaneously lacked the L42 mutation that typically occurs with L43. This anomaly was verified by testing the same person at Family Tree DNA. So L43+/S147+ is now a separate category. The technical specifications for L43 are as follows: Y chromosome location 16446759; forward primer GAGGTTTTCGGAGCTTACCTATAC; reverse primer CACTGCTTGTAGATAGTAAAGTTTG; the mutation involves a change from A to G.[14]

G2a3b1a2a1 (L43+/S147+, L42+/S146+)

About a fourth of DYS388=13 men have this L42/S146 mutation. Swiss G2a3b1a men are more likely than average to belong to this subgroup. L42/S146 could be nearly as old as the DYS388=13 mutation (over 2,500 yrs.) based on the number of value differences seen in 67-marker STR samples.

The SNP that characterizes G2a3b1a2a1 was first identified in a listing of SNP results from testing at 23andMe. It was independently developed as a separate test by both Family Tree DNA as L42 and by Ethnoancestry as S146. The technical specifications for this SNP are as follows: position on the Y chromosome 15170153; forward primer CTCACAATAGGCAGCATCCCCTCAG; reverse primer CAGAAAAAGGGAGCATATGACCAAGG; the mutation involves a change from C to A.[15]
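
The way SNP test results map onto the subgroup names used so far can be sketched in a few lines of Python. This is a minimal illustration, not a classification tool used by any testing company: the SNP-to-subgroup table is taken from the text, the function and variable names are hypothetical, and the sketch assumes that a positive result for a downstream SNP implies membership in all upstream subgroups.

```python
# Defining SNP -> subgroup, limited to branches named in the text.
SNP_TO_CLADE = {
    "P303": "G2a3b1",
    "L140": "G2a3b1a",
    "U1": "G2a3b1a1",
    "L13": "G2a3b1a1a",
    "L497": "G2a3b1a2",
    "L43": "G2a3b1a2a",
    "L42": "G2a3b1a2a1",
}

# Subgroup -> parent subgroup (None marks the top of this sketch).
PARENT = {
    "G2a3b1": None,
    "G2a3b1a": "G2a3b1",
    "G2a3b1a1": "G2a3b1a",
    "G2a3b1a1a": "G2a3b1a1",
    "G2a3b1a2": "G2a3b1a",
    "G2a3b1a2a": "G2a3b1a2",
    "G2a3b1a2a1": "G2a3b1a2a",
}


def depth(clade):
    """Number of steps from `clade` up to the top of the sketch, inclusive."""
    steps = 0
    while clade is not None:
        clade = PARENT[clade]
        steps += 1
    return steps


def deepest_clade(calls):
    """Return the most downstream subgroup with a positive SNP result.

    `calls` maps SNP name -> True (derived, '+') or False (ancestral, '-').
    Untested SNPs are simply absent. Returns None if nothing is positive.
    """
    derived = [SNP_TO_CLADE[snp] for snp, positive in calls.items()
               if positive and snp in SNP_TO_CLADE]
    return max(derived, key=depth, default=None)


# The rare case described above: L43+ but L42-.
print(deepest_clade({"P303": True, "L140": True, "L497": True,
                     "L43": True, "L42": False}))
```

The sketch does not check for contradictory results, such as positives on both the U1 and L497 branches; real results of that kind would need retesting rather than automatic assignment.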

G2a3b1a2a1a (rs34136765=T+)

This mutation was found in a raw data file of a man with an English/Scottish surname at 23andMe in summer 2010. The mutation is from C to T. Another designation for the chromosome location is 15862538. This mutation so far is found only in a single person, and four other G2a3b1a2a men tested negative (ancestral) for this mutation. It is yet to be determined if this SNP is found only in the family of the positive man or has more general coverage. It is to be noted that the man with this mutation has a large loss of repeats at STR marker DYS413a. There is at least one other man genetically near to him with the same DYS413a oddity. Whichever mutation is eventually determined to be more recent will become a subgroup of the other.

L297+

This mutation was identified at Family Tree DNA in summer 2010. The mutation is from G to T. The chromosome location is 15170096. This mutation so far is found only in a single person, and seven other G2a3b1a2a men tested negative (ancestral) for this mutation. It is yet to be determined if this SNP is found only in the family of the positive man or has more general coverage.

L139+

The L139 SNP has so far been found in only one DYS388=13 man, and it is uncertain if this SNP is only familial or has slightly broader coverage. L139 is clearly not common, because multiple other DYS388=13 men have tested negative for it. The L139+ man is known to be negative (ancestral) for the L43 and L42 SNPs.

SNP L139 was first identified at Family Tree DNA in Houston, Texas, in mid-2009 and made available for testing at that time. The technical specifications for L139 are as follows: located at 13981304 on the Y chromosome; the mutation involves a change from G to A.[16]

L486+

A finding of A+ at rs2538860 on the Y chromosome was made in one DYS388=13 man during testing at 23andMe, and it is uncertain if this SNP is only familial or has slightly broader coverage. This mutation is clearly not common, since multiple other DYS388=13 men have tested negative for it. The man with the mutation is known to be negative (ancestral) for the L43 and L42 SNPs. None of the commercial labs have yet provided a shortened name, such as L139, for this mutation.

The mutation at rs2538860 was first identified in a raw data file at 23andMe in June 2010. The technical specifications are as follows: located at 10342008 on the Y chromosome; the mutation involves a change from C to A.

DYS391=7

This multi-value (multistep) mutation at STR marker DYS391 to the value of 7 from the original 10 is found in a group of Hispanic men. No information is available about their L42 and L43 status.

DYS464a=9

This multi-value (multistep) mutation at STR marker DYS464a to the value of 9 is found so far only in Swiss and German men. No information is available about their L42 and L43 status.

DYS388=15

This small subgroup is composed of men whose ancestor mutated two values at STR marker DYS388 to 15. Members of this subgroup must have other marker values similar to persons in the overall DYS388=13 subgroup. So far only persons of English ancestry belong to this DYS388=15 subgroup. Marker DYS388 rarely mutates, and a two-step (two-value) mutation is almost as valuable as a SNP mutation in identifying persons within this distinctive subgroup.

DYS393=12 with Genetic Nearness

This small subgroup is composed of men whose ancestor mutated at STR marker DYS393 to 12. This marker value is unusually low for G persons. The persons with this finding seem to report ancestral origins primarily in Cyprus based on current knowledge.

DYS594=12 with Genetic Nearness

While a mutation to a value of 12 from 10 or 11 is seen primarily in this group, there exist a few DYS594=12 men who do not belong to the group. The men in this group form a distinctive cluster of persons with closely related STR marker values in addition to the DYS594 oddity. This DYS594=12 subgroup has an unusually high percentage of Welsh surnames with the rest mostly of English ancestry based on available samples. Multiple persons in this group have tested negative for the L43 and L42 SNP mutations.

DYS568=9 Subgroup containing G2a3b1a3

The final major subgroup is characterized in a high percentage of cases by the values of 9 at STR marker DYS568 and less reliably 20,21 at marker YCA together with a close relationship based on STR marker values. The reason DYS568=9 can be used as a generally reliable categorization value is due to the fact this represents a multi-step mutation in a very slowly mutating marker. Although not the subject of a research study, the age of the mutation to 9 at DYS568 may have been about 3,000 yrs. ago based on the number of marker value differences of 67-marker STR samples. And the mutation to 20,21 at YCA would have arisen in this same general time period. Persons within the DYS568=9 group who were tested for the marker GATA-A10 had values one or more higher than found in other haplogroup G subgroups. Those from the Ashkenazi cluster had the highest values of 14. Additional results would be needed to determine if these findings are consistent within the DYS568=9 group.

No DYS568=9 persons have been located in the Middle East or Anatolia region where haplogroup G can be unusually common. Several samples, however, have been found among Ossetians in the central Caucasus Mountains. Though found all over Europe, DYS568=9 is so far missing from Scandinavian samples north of Denmark.

This DYS568=9 subgroup contains a further large subgroup consisting of Ashkenazi Jews who are relatively closely related based on STR marker values and typically have a value of 16 for marker DYS385b. The Jewish cluster does not seem to share a common ancestor with the non-Jewish men within the Current Era. The common ancestor of the Ashkenazi DYS568=9 men likely lived in the Middle Ages, based on the small number of STR marker value differences seen among them. See also the page covering Jews with Haplogroup G (Y-DNA).

There is another, smaller subgroup of DYS568=9 persons who have the value of 9 at STR marker DYS439. The ancestral value for this marker is 12 within the DYS568=9 group, and this 9 represents a rare multi-step mutation. This DYS439=9 subgroup is predominantly German, and the mutation is probably over 2,000 years old based on number of marker value differences in 67-marker STR samples.

The Haplogroup G Project has indicated among its large G collection that likely or proven DYS568=9 samples comprise the following percentages of available G samples in the following countries [in descending order]:

Ireland, 12%; England, 9%; Netherlands, 5%; Poland, 5%; Italy, 4%; Germany, 3%; Spain, 3%; France, 2%; Switzerland, 0%.[11]

Within the DYS568=9 subgroup, which lacks its own SNP, is subgroup G2a3b1a3 (L640+). This is a small group of men presently all from the British Isles. This SNP was identified in summer 2011 at Family Tree DNA. It represents a mutation from A to G and is found at position 16903082 on the Y chromosome. Most, if not all, of these L640+ men also have the value of 8 at marker DYS533, which is otherwise rare among DYS568=9 men.

G2a3b1a4 (L660+, L662+)

This G2a3b1a4 subgroup is a small one, and so far found only in Europeans. Both SNPs involved were first identified at Family Tree DNA in summer 2011. L660 is found at position 12511525 on the Y chromosome and is a change from C to A. L662 is found at position 16446702 and is a change from C to T.

G2a3b1b (L694+)

Persons in this subgroup have the L694 mutation which was discovered at Family Tree DNA in summer 2011. So far, this mutation has been found primarily in Polish men. It is located at position 5734987 on the Y chromosome and is an insertion mutation.

References

  1. http://ymap.ftdna.com/cgi-bin/gbrowse_details/hs_chrY?name=P303;class=Sequence;ref=ChrY;start=20104736;end=20104736;feature_id=41027
  2. https://sites.google.com/site/haplogroupgproject/project-roster
  3. http://en.wikipedia.org/wiki/Haplogroup_G_(Y-DNA)_Country_by_Country
  4. Contu, D., et al. (2008). "Y-Chromosome Based Evidence for Pre-Neolithic Origin of the Genetically Homogeneous but Diverse Sardinian Population: Inference for Association Scans". PLoS ONE 3 (1): e1430. doi:10.1371/journal.pone.0001430. PMC 2174525. PMID 18183308. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=2174525.
  5. Zei, G., et al. (2003). "From Surnames to the history of Y chromosomes: the Sardinian population as a paradigm". Eur J of Human Genetics 11 (10): 802–07. doi:10.1038/sj.ejhg.5201040. PMID 14512971. "The Sardinian highlands do have an unusual percentage of a certain type of haplogroup I."
  6. Adams, S., et al. (2008). "The Genetic Legacy of Religious Diversity and Intolerance: Paternal Lineages of Christians, Jews, and Muslims in the Iberian Peninsula". Amer J of Human Genetics 83 (6): 725–36. doi:10.1016/j.ajhg.2008.11.007. PMC 2668061. PMID 19061982. http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B8JDD-4V3MRN1-5&_user=10&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=620f4d4b4e2ba5eac1b6423929b33128.
  7. Rodriguez, V., et al. (2009). "Genetic Sub-structure in Western Mediterranean Populations Revealed by 12 Y-Chromosome STR Loci". Intl J of Legal Medicine 123 (2): 137–41. doi:10.1007/s00414-008-0302-y. PMID 19066931.
  8. Mound, Gloria. "Survivors of the Spanish Exile: The Underground Jews of Ibiza". http://www.jcpa.org/jl/hit12.htm
  9. http://ymap.ftdna.com/cgi-bin/gbrowse_details/hs_chrY?name=L140;class=Sequence;ref=ChrY;start=7630859;end=7630859;feature_id=40508
  10. Sims, L., et al. (2009). "Improved Resolution Haplogroup G Phylogeny in the Y Chromosome, Revealed by a Set of Newly Characterized SNPs". PLoS ONE 4 (6): 1–5. doi:10.1371/journal.pone.0005792. PMC 2686153. PMID 19495413. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=2686153.
  11. http://tech.groups.yahoo.com/group/HaploGNewsGrp/
  12. Athey, W. (2007). "A Major Subclade of Haplogroup G2". J of Genetic Genealogy 3 (1): 14ff. http://www.jogg.info/31/athey.htm.
  13. El Sibai, M., et al. (2009). "Geographical Structure of the Y-Chromosomal Genetic Landscape of the Levant: A Coastal Inland Contrast". Annals of Human Genetics 73 (6): 568–81. doi:10.1111/j.1469-1809.2009.00538.x. PMID 19686289.
  14. http://ymap.ftdna.com/cgi-bin/gbrowse_details/hs_chrY?name=L43;class=Sequence;ref=ChrY;start=16446759;end=16446759;feature_id=40931
  15. http://ymap.ftdna.com/cgi-bin/gbrowse_details/hs_chrY?name=L42;class=Sequence;ref=ChrY;start=15170153;end=15170153;feature_id=40874
  16. http://ymap.ftdna.com/cgi-bin/gbrowse_details/hs_chrY?name=L139;class=Sequence;ref=ChrY;start=13981304;end=13981304;feature_id=40794
